Fix exponential blow-up in skip_stages on chains of let-bound selects#9147
Fix exponential blow-up in skip_stages on chains of let-bound selects#9147abadams wants to merge 3 commits into
Conversation
In SkipStages::visit(Select), the two branches' per-Func .used / .loaded predicates were combined as `(t_used && cond) || (f_used && !cond)`. When both branches contributed the same Expr -- which is exactly what happens when both branches read the same let-stashed FuncInfo from an outer let -- make_or could not recognise the And nodes as equivalent (they aren't same_as even when their operands are), so the predicate roughly doubled in size at every nested Select. A long chain of CSE'd lets where each let value contains a Select then drove the predicate size to 2^N, well past the point where allocating the IR is feasible. Combine the two branches with `select(cond, t, f)` instead, and add a make_select helper that collapses `select(c, X, X) -> X` and the constant-cond cases. When both branches contributed the same Expr, make_select drops the condition immediately and the chain stays linear. The new correctness test (many_inlined_selects.cpp) constructs a 500- element CSE'd let chain whose values each carry a Param<bool>-gated Select, then feeds the chain into a final Select. With the bug present this test would not terminate -- skip_stages would crash allocating ~2^500 IR nodes long before any reasonable timeout fired. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When an id is only touched on one branch of the Select, the previous code passed an undefined Expr to a `combine` helper that then turned `undefined` into const_false and built a `select(cond, X, false)` -- which is just `X && cond` dressed up as a select. Call make_and directly in those cases and keep make_select for the both-branches case, where the `select(c, X, X) -> X` collapse is the whole point. Also factor the "merge into old" body into a small helper to remove the duplication. No behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lowercase the Func name, drop the unnecessary top-level select and output schedule, and make each chain entry depend on chain.back() so nothing gets eliminated as dead. The test still reproduces the pre-fix exponential blow-up (verified by reverting the fix: it times out at 30s on a 500-element chain). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #9147 +/- ##
=======================================
Coverage ? 69.77%
=======================================
Files ? 255
Lines ? 77556
Branches ? 18541
=======================================
Hits ? 54118
Misses ? 17962
Partials ? 5476 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| Expr make_select(const Expr &c, const Expr &t, const Expr &f) { | ||
| if (is_const_one(c)) { | ||
| return t; | ||
| } | ||
| if (is_const_zero(c)) { | ||
| return f; | ||
| } | ||
| if (t.same_as(f)) { | ||
| return t; | ||
| } | ||
| return select(c, t, f); | ||
| } |
There was a problem hiding this comment.
This (and the helpers above) seems like the kind of thing that is likely to be useful elsewhere in the compiler, maybe even duplicated already. I suppose we didn't want to call simplify(Select::make(...)) here for some reason.
There was a problem hiding this comment.
True enough, but in this context eager simplification is absolutely required for this to not blow up. If I just made this the default behavior of operator&&/operator||/select I wouldn't have that clear guarantee at this point in the code.
In SkipStages::visit(Select), the two branches' per-Func .used / .loaded predicates were combined as
(t_used && cond) || (f_used && !cond). When both branches contributed the same Expr -- which is exactly what happens when both branches read the same let-stashed FuncInfo from an outer let -- make_or could not recognise the And nodes as equivalent (they aren't same_as even when their operands are), so the predicate roughly doubled in size at every nested Select. A long chain of CSE'd lets where each let value contains a Select then drove the predicate size to 2^N, well past the point where allocating the IR is feasible.Combine the two branches with
select(cond, t, f)instead, and add a make_select helper that collapsesselect(c, X, X) -> Xand the constant-cond cases. When both branches contributed the same Expr, make_select drops the condition immediately and the chain stays linear.The new correctness test (many_inlined_selects.cpp) constructs a 500- element CSE'd let chain whose values each carry a Param-gated Select, then feeds the chain into a final Select. With the bug present this test would not terminate -- skip_stages would crash allocating ~2^500 IR nodes long before any reasonable timeout fired.